# GUI Visual Positioning
GUI Actor 7B Qwen2 VL
MIT
GUI-Actor-7B is a vision-language model built on Qwen2-VL-7B-Instruct for graphical user interface (GUI) agent tasks. It provides a coordinate-free approach to visual grounding.
Multimodal Fusion
Transformers

microsoft
207
14
Uground V1 2B
Apache-2.0
UGround is a strong GUI visual grounding model trained with a simple recipe, developed jointly by OSUNLP and Orby AI.
Multimodal Fusion
Transformers
English

osunlp
975
8
Uground
UGround is a strong GUI visual grounding model trained with a streamlined recipe, developed by the Ohio State University NLP Group in collaboration with Orby AI.
Image-to-Text
osunlp
208
23